TITLE

Capturing Graphics Primitives Associated With A Display Object Rendered To A Graphical User Interface

TECHNICAL FIELD 

This invention relates generally to methods and systems for retrieving information 
descriptive of graphic elements that are displayed to a user interface and, more 
particularly, relates to methods and systems for retrieving graphics primitives and 
associated attributes of such graphics primitives of a display object rendered to a user 
interface. 

BACKGROUND OF THE INVENTION 

To successfully compete in the global market, a company's advertising literature 
and products must be easily understood by everyone, regardless of language or cultural 
differences. This requirement is perhaps most apparent within the vast software 
technology market, where software tools such as Microsoft Word are commonplace on a
global scale and must be comprehensible to users of any culture. The need for the 
accurate representation of language within software products marketed and sold 
worldwide is the essence of the localization industry. Localization of a product is the 
accurate translation and adaptation of any software or executable/viewable code into the 
language of the locality into which the product is being marketed and sold. 

In order to work effectively across geographic and cultural borders, a localized 
software product must have the highest quality translation from a source language to a 
native or local language while retaining the functionality of the original product. The 
layout of text within a graphical user interface (GUI) is one of the biggest obstacles to 
overcome when localizing a product because the localized version of code must have the 
same general appearance and meaning as the original. This requires that the localization 
tool used to perform the translation be able to completely access and receive as input all 
of the graphics primitives that are rendered to the user interface screen during the 
execution of the software application. A graphics primitive is a drawing element, such as 
a text character, line, arc, polygon, etc., that is drawn to a user interface according to the
specific function calls and mechanisms of the operating system. Each graphics primitive
has its own set of attributes that define its appearance and/or style within the user 
interface. These attributes include visual and stylistic characteristics such as the text style 
or font, line length and style, arc length, etc. In GUI based software applications, 
multiple graphics primitives are combined to create the various display objects (e.g. 
buttons, menus, dialogue boxes, etc.) that are displayed to the user when they are using 
the application. 

Because a typical application can include many different display objects, the 
graphics primitives that comprise the objects provide the primary information and data to 
be localized. For instance, a display object such as a dialog box can include a data entry 
field, user buttons containing text characters or strings, and/or other graphics primitives. 
To properly translate the text strings, the localization tool must be able to access all of the 
text primitives for the dialog box. Likewise, the specific attributes of the button, such as 
the length and shape of the button, must be known in order to account for changes in the 
length of a string due to translation. Once the graphics primitives that comprise the 
various display objects are determined, they can be localized accordingly, and the user 
interface of the application as a whole can be modified to suit the intended locality. 

The graphics primitives that comprise the various display objects within an 
application can be accessed by conventional means. In Windows based applications for 
example, the graphics primitives are indicated by a resource file (*.res) that is stored 
within the application's executable (*.exe). Resource files are simply plain text scripts 
that indicate the various resources required for the application to run successfully, and 
can be viewed with a standard resource editor/viewer tool. The resource files are 
converted into a binary representation at compile time, and then merged into the 
executable image of the application during runtime. Resource types include text string 
tables, which contain the various text strings that are displayed by the application during 
runtime. Other resource types often required by an application include menus, dialog 
boxes, cursors, icons, toolbars, bitmaps and other display objects composed of one or 
more graphics primitives. The resource files provide access to all of the display objects, 
and consequently the graphics primitives associated with the application. 



Despite the extensive information provided from the resource files, however, 
many localization errors still occur because one or more text strings are missed during the 
localization process. This is because standard methods of capturing graphics primitives 
and associated attributes of such graphics primitives are limited to only those display 
objects that are standard objects of the operating system (OS). Yet, there are many GUI 
based software applications that contain "custom-class" or "owner-draw" controls. These 
types of controls represent customized display objects that perform special functions or 
that have attributes that differ from the standard set of objects provided by the OS. So, 
while these customized objects are indicated as resources of the application within the 
resource file, the specific graphics primitives and associated attributes of the primitives 
that comprise the objects cannot be obtained directly from the resource file for 
localization. Rather, the primitives of customized display objects cannot be revealed 
until the object is invoked by the application during runtime. 

Access to the graphics primitives that comprise the various display objects within 
the application, however, is still not enough to ensure a literal translation of a software 
product. The localization tool must also be able to know where and how the various text 
strings indicated in the resource file are used within the application. As described, the 
resource file indicates all of the graphics primitives relative to the executable application, 
and includes a text string table that contains the various literal strings and text characters 
displayed by the application during runtime. While the strings within the table can be 
easily accessed and localized accordingly, the table does not explicitly indicate the 
display object that a particular string corresponds to. The actual usage, or context of the 
string cannot be determined until it is displayed by the application during runtime. 
Context refers specifically to any information that allows the localization tool to account 
for the differences in meaning that occur when the same string or phrase is displayed in 
different ways within the application. For example, the term 'O.K.' may have a different 
meaning as it appears in a dialogue box than in a pull-down menu. 
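
By way of illustration only, the role of context can be sketched in Python as follows. The sketch assumes a hypothetical string table, a hypothetical `CapturedText` record and a hypothetical glossary keyed by both text and display-object type; none of these names come from the described system, and the target strings are placeholders rather than real translations.

```python
from dataclasses import dataclass

# A string table maps a resource ID to text but records no display object.
STRING_TABLE = {101: "O.K."}  # hypothetical resource ID


@dataclass(frozen=True)
class CapturedText:
    """Text captured at runtime together with its context (assumed fields)."""
    text: str
    object_type: str  # e.g. "dialog" or "menu"


# Hypothetical glossary keyed by (text, context). The target strings are
# placeholders, not real translations.
GLOSSARY = {
    ("O.K.", "dialog"): "<O.K. as dialog confirmation>",
    ("O.K.", "menu"): "<O.K. as menu status>",
}


def localize(captured: CapturedText, glossary: dict) -> str:
    """Pick the entry matching both the text and its context.

    Falls back to the untranslated text when no entry exists.
    """
    return glossary.get((captured.text, captured.object_type), captured.text)


# The same table entry, observed in two different display objects.
in_dialog = CapturedText(STRING_TABLE[101], "dialog")
in_menu = CapturedText(STRING_TABLE[101], "menu")
```

In this sketch the string table alone cannot distinguish the two uses, whereas the captured records can be localized separately.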

In addition to having accurately translated strings that are used in the correct 
context, the localized product must also maintain the same font properties as the original 
application. For instance, a button within the original application having text that reads
"EXIT" should read as "SALIDA" when localized for Spanish speaking users. The
literal meaning of the text as well as the font properties, which in this case are Times
New Roman, bold and italic to name a few, should be maintained from one version to
the next. However, if the localized button reads as "SALIDA" in a plain typeface, the intentional emphasis
placed on the original text is lost. This can cause problems in applications where varying
font sizes, typefaces, and styles are required to effectively convey information to a user of
the application. Unfortunately, there is no convenient way for the font properties of text 
strings to be captured during the localization process, such as from the resource file. This 
is because the font properties (which are attributes of a text primitive) are generally 
stored within a temporary data structure allocated for the string known as a device 
context. The information maintained within this data structure, including the font
properties, is discarded by the application from memory after the text is drawn to the 
screen. Again, this information can only be determined during the actual runtime of the 
application. 

To overcome the limitations discussed above, a way is needed to easily access the
graphics primitives and associated attributes of any display object (standard and
non-standard) called during the execution of an application. Likewise, a convenient means of
determining the context of the text strings that get displayed to a user during the
execution of the application is necessary to ensure that the text strings are associated with
the correct display object. A way to capture the font properties of a text string or
character is also needed so that this information is made available with the other
attributes of a graphics primitive.

SUMMARY OF THE INVENTION

The present invention provides a mechanism for capturing the one or more
graphics primitives associated with an application as it is in execution. Moreover, the
invention allows for the detection and retrieval of the unique attributes of the one or more
graphics primitives as they are drawn to a graphical user interface. These graphic 
capturing techniques can be applied directly to any controls, buttons, windows and/or any 
other display objects that can be invoked by an application, including those that are 
custom drawn or non-standard with respect to the operating system. 

A calling process, such as a localization tool, utilizes the invention to capture
graphics primitives, such as text strings, that are displayed to the screen by a target
process during runtime. The target process is any computer implemented process or
software application that requires a graphical user interface (GUI) to display visual
information to a computer user. In operation, the calling process invokes an injection
DLL (Dynamic Link Library) to inject a spy DLL into the executable code of the target
process. Once the spy DLL is injected, it installs patches and hook functions into the
operating system APIs (Application Programming Interfaces) that have routines for
displaying text to a graphical user interface. The hook functions monitor the operating
system messages generated during the execution of the target application, while the
patches allow for the capture of the various graphics primitives and associated attributes
of the primitives that are drawn to the user interface.

Whenever a display object is rendered to the GUI by the target application as a 
result of an invoked action (e.g. mouse-clicking, function key), the hook functions are 
called to capture the operating system messages passed and the patches capture the 
graphics primitives of the object. The patches also capture the unique attributes of the 
graphics primitives, including the font properties of a displayed text. This captured
information is then packaged and delivered to the calling process for processing.
Because the graphics primitives are captured in connection with the operating system
messages passed during runtime, the calling process obtains complete information about
any viewable or executable objects displayed by the target process. The operating system
messages provide a context for a captured graphics primitive, which allows the calling
process to better associate a captured primitive with a specific display object. As an
example, a text string primitive can be easily associated with a specific dialogue box that
is called by the application as a result of a user action. Furthermore, the invention allows
the graphics primitives and associated attributes of custom/user drawn objects to be
captured. This overcomes the limitations imposed by the operating system on allowing
the unique attributes of non-standard objects to be exposed by the resource file for the 
application. 
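
As a minimal sketch, assuming a simplified event stream, the association of captured primitives with display objects can be illustrated in Python. The event tuples, the handle values, the sample strings and the `DisplayObjectRecord` type are hypothetical; real operating system messages carry considerably more information.

```python
from dataclasses import dataclass, field


@dataclass
class DisplayObjectRecord:
    """Everything learned about one display object (assumed layout)."""
    handle: int
    object_type: str = "unknown"
    primitives: list = field(default_factory=list)


def correlate(events):
    """Group captured primitives by the handle of the object they belong to.

    Each event is (kind, handle, data), where kind is "message" (data is the
    object type reported by a hooked message) or "primitive" (data is the
    captured text).
    """
    records = {}
    for kind, handle, data in events:
        record = records.setdefault(handle, DisplayObjectRecord(handle))
        if kind == "message":
            record.object_type = data
        elif kind == "primitive":
            record.primitives.append(data)
    return records


# Hypothetical interleaving of hooked messages and patched drawing calls.
EVENTS = [
    ("message", 0x204, "dialog"),
    ("primitive", 0x204, "Save changes?"),
    ("message", 0x210, "menu"),
    ("primitive", 0x210, "File"),
    ("primitive", 0x204, "O.K."),
]
records = correlate(EVENTS)
```

Keying the records by handle is a design choice of this sketch; it lets primitives that arrive interleaved from several objects still be attributed to the correct one.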

Additional features and advantages of the invention will be made apparent from 
the following detailed description of illustrative embodiments that proceeds with 
reference to the accompanying figures.

BRIEF DESCRIPTION OF THE DRAWINGS

While the appended claims set forth the features of the present invention with 
particularity, the invention, together with its objects and advantages, may be best 
understood from the following detailed description taken in conjunction with the 
accompanying drawings of which: 

Figure 1 is a block diagram generally illustrating an exemplary computer system 
on which the present invention resides; 

Figure 2 is a diagram illustrating the target application within a user interface 
having a dialog box spawned as a result of an invoked action within the target 
application; 

Figure 3 is a functional block diagram illustrating the major components of the 
invention; 

Figure 4 is a flow chart illustrating the method executed by the calling process for 
capturing graphics primitives and system messages generated by the target process; and 

Figure 5 is a diagram illustrating a data structure containing information 
descriptive of the graphics primitives that comprise a display object. 

DETAILED DESCRIPTION OF THE INVENTION 

Turning to the drawings, wherein like reference numerals refer to like elements, 
the invention is illustrated as being implemented in a suitable computing environment. 
Although not required, the invention will be described in the general context of 
computer-executable instructions, such as program modules, being executed by a 
personal computer. Generally, program modules include routines, programs, objects, 
components, data structures, etc. that perform particular tasks or implement particular 
abstract data types. Moreover, those skilled in the art will appreciate that the invention 
may be practiced with other computer system configurations, including hand-held 
devices, multi-processor systems, microprocessor based or programmable consumer 
electronics, network PCs, minicomputers, mainframe computers, and the like. The 
invention may also be practiced in distributed computing environments where tasks are 
performed by remote processing devices that are linked through a communications 
network. In a distributed computing environment, program modules may be located in 
both local and remote memory storage devices. 



Figure 1 illustrates an example of a suitable computing system environment 100
on which the invention may be implemented. The computing system environment 100 is
only one example of a suitable computing environment and is not intended to suggest any
limitation as to the scope of use or functionality of the invention. Neither should the
computing environment 100 be interpreted as having any dependency or requirement
relating to any one or combination of components illustrated in the exemplary operating 
environment 100. 

The invention relates to the capture and retrieval of the one or more graphics 
primitives that comprise a display object rendered to a graphical user interface by an 
application executing on the operating system 100. In the context of a graphical user 
interface, a display object is any drawing element that can be viewed by a user of a 
computer from a user interface screen (monitor) 191. More specifically, a display object 
refers to any drawing element that displays lines, text, images and other visible 
information to a graphical user interface (GUI). This includes, but is not limited to 
control and dialogue boxes, functional buttons, menu screens, combo boxes, and any 
graphical windows that are capable of being executed or invoked by an application 
running on the operating system 144. The graphics primitives that comprise the display 
object can be easily captured by practicing the methods of the invention, and 
subsequently returned to a calling process such as a localization tool or text-to-speech 
application for processing. 

As described herein, the term "process" refers to an executable procedure, 
computing task, or part of a program that is executed by the computer 110. More
specifically, a "calling process" is the process that utilizes the invention to capture the
one or more graphics primitives of a display object that can be invoked by the various
application programs 145 on the computer. The graphics primitives include drawing
elements such as text characters or strings, lines, arcs, polygons, etc., and have associated
attributes that define their visual appearance such as font size, line length, and arc length.
The calling process can be any computer executable process, such as a localization tool
or text-to-speech application that requires access to the drawing elements that are drawn
to a graphical user interface by an application. Similarly, a "target application" or "target
process" refers to the particular process, executable procedure, or application from which a
particular text string is to be captured. In accordance with the invention, the calling
process captures the one or more graphic primitives and associated attributes of the 
primitives that are displayed to a graphical user interface by the target 
application/process. 

Display objects are commonly used within operating systems that support a
graphical user interface. For example, in the Windows operating system, a standard set
of display objects is available for use by the various application programs 145 that
execute on the computer 110. In this way, a dialog box or menu display used by
Microsoft Word, for example, can also be called upon by Microsoft Excel to serve the
same functional purpose. Standard display objects have a mode of operation that is
defined according to the functions and mechanisms of the operating system, while the
appearance of the display object is defined according to one or more graphics primitives
that comprise the object. For instance, multiple lines, curves, and text characters can be
combined to yield a user button. The appearance of the button is further determined by
the specific attributes of the graphics primitives such as the font type, text alignment and
placement, line thickness and shading, style, etc. Such graphics primitives and associated
attributes can be easily determined through conventional methods, such as by accessing
the resource file of the target application. However, the unique attributes of
objects that are non-standard with respect to the operating system cannot be determined
through conventional means.

In contrast, the present invention allows for the capture of the graphics primitives
and associated attributes of both standard and non-standard display objects.
Non-standard display objects typically fall into two categories, namely owner-draw controls or
custom class objects. Owner-draw controls are display objects that have a customized
style or appearance and perform tasks that differ from those of the standard objects.
Similarly, custom class objects consist of unique functions or class names that are not
recognized by the OS. Because these types of objects are not standard with respect to the
operating system, the unique attributes associated with these objects cannot be
determined by conventional methods. For instance, the resource file of an application
having an owner-draw or custom class control does not reveal the specific graphics
primitives or attributes of the object. Customized features cannot be determined directly
from the static (non-executed) resource file. Rather, they can only be realized once the
display object is invoked during the runtime of the application. Current methods of 
capturing graphics primitives are limited to the information contained within the "static" 
resource file, and do not support the capture of "dynamic" (runtime) information. The 
invention overcomes this limitation by allowing the graphics primitives and unique 
attributes of customized display objects to be captured during runtime execution of the 
application. 

Still further, the invention also relates to a method and system for capturing 
"context information" associated with any text that is displayed during the execution of a 
process or application. In general, context information is information that is descriptive 
of the display object in which the text appears. This descriptive information can include 
parameters such as the type of display object (dialog box, menu, window, etc.) and its 
current state (active/inactive). Context information also includes system information 
such as the API calls and/or function calls made by the target application to render the 
display object to a user interface, the object handle or resource ID, the specific location of 
files called during execution of the display object, and any other information that 
provides a general context for the text that is displayed to the user interface screen 191 
during the execution of the target process or application. Context information is obtained 
by intercepting the system messages that are passed between the target application and 
the operating system during the execution of the application. Capturing the text in 
connection with the operating system messages provides a general context for the text, 
and allows the information to be associated with a specific task or process within the 
target application. 

The computing system environment 100 includes a general purpose computing 
device in the form of a computer 110. Components of computer 110 may include, but are 
not limited to, a processing unit 120, a system memory 130, and a system bus 121 that 
couples various system components including the system memory to the processing unit 
120. The system bus 121 may be any of several types of bus structures including a 
memory bus or memory controller, a peripheral bus, and a local bus using any of a 
variety of bus architectures. By way of example, and not limitation, such architectures 
include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA)
bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local
bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus. 

Computer 110 typically includes a variety of computer readable media. Computer
readable media can be any available media that can be accessed by computer 110 and 
includes both volatile and nonvolatile media, removable and non-removable media. By 
way of example, and not limitation, computer readable media may comprise computer 
storage media and communication media. Computer storage media includes volatile and 
nonvolatile, removable and non-removable media implemented in any method or 
technology for storage of information such as computer readable instructions, data 
structures, program modules or other data. Computer storage media includes, but is not 
limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD- 
ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, 
magnetic tape, magnetic disk storage or other magnetic storage devices, or any other 
medium which can be used to store the desired information and which can be accessed by 
computer 110. Communication media typically embodies computer readable 
instructions, data structures, program modules or other data in a modulated data signal 
such as a carrier wave or other transport mechanism and includes any information 
delivery media. The term "modulated data signal" means a signal that has one or more of 
its characteristics set or changed in such a manner as to encode information in the signal. 
By way of example, and not limitation, communication media includes wired media such 
as a wired network or direct-wired connection, and wireless media such as acoustic, RF, 
infrared and other wireless media. Combinations of any of the above should also be
included within the scope of computer readable media. 

The system memory 130 includes computer storage media in the form of volatile 
and/or nonvolatile memory such as read only memory (ROM) 131 and random access 
memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic 
routines that help to transfer information between elements within computer 110, such as
during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or
program modules that are immediately accessible to and/or presently being operated on
by processing unit 120. By way of example, and not limitation, Figure 1 illustrates
operating system 134, application programs 135, other program modules 136, and
program data 137. 

The computer 110 may also include other removable/non-removable,
volatile/nonvolatile computer storage media. By way of example only, Figure 1
illustrates a hard disk drive 141 that reads from or writes to non-removable, nonvolatile
magnetic media, a magnetic disk drive 151 that reads from or writes to a removable, 
nonvolatile magnetic disk 152, and an optical disk drive 155 that reads from or writes to a 
removable, nonvolatile optical disk 156 such as a CD ROM or other optical media. Other 
removable/non-removable, volatile/nonvolatile computer storage media that can be used 
in the exemplary operating environment include, but are not limited to, magnetic tape 
cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, 
solid state ROM, and the like. The hard disk drive 141 is typically connected to the 
system bus 121 through a non-removable memory interface such as interface 140, and 
magnetic disk drive 151 and optical disk drive 155 are typically connected to the system 
bus 121 by a removable memory interface, such as interface 150. 

The drives and their associated computer storage media, discussed above and
illustrated in Figure 1, provide storage of computer readable instructions, data structures,
program modules and other data for the computer 110. In Figure 1, for example, hard 
disk drive 141 is illustrated as storing operating system 144, application programs 145, 
other program modules 146, and program data 147. Note that these components can 
either be the same as or different from operating system 134, application programs 135, 
other program modules 136, and program data 137. Operating system 144, application 
programs 145, other program modules 146, and program data 147 are given different 
numbers here to illustrate that, at a minimum, they are different copies. A user may enter 
commands and information into the computer 110 through input devices such as a
keyboard 162 and pointing device 161, commonly referred to as a mouse, trackball or 
touch pad. Other input devices (not shown) may include a microphone, joystick, game 
pad, satellite dish, scanner, or the like. These and other input devices are often connected 
to the processing unit 120 through a user input interface 160 that is coupled to the system 
bus, but may be connected by other interface and bus structures, such as a parallel port, 
game port or a universal serial bus (USB). A monitor 191 or other type of display device
is also connected to the system bus 121 via an interface, such as a video interface 190. In
addition to the monitor, computers may also include other peripheral output devices such
as speakers 197 and printer 196, which may be connected through an output peripheral
interface 190. 

The computer 110 may operate in a networked environment using logical
connections to one or more remote computers, such as a remote computer 180. The
remote computer 180 may be another personal computer, a server, a router, a network
PC, a peer device or other common network node, and typically includes many or all of 
the elements described above relative to the personal computer 110, although only a 
memory storage device 181 has been illustrated in Figure 1. The logical connections 
depicted in Figure 1 include a local area network (LAN) 171 and a wide area network 
(WAN) 173, but may also include other networks. Such networking environments are 
commonplace in offices, enterprise-wide computer networks, intranets and the Internet. 

When used in a LAN networking environment, the personal computer 110 is
connected to the LAN 171 through a network interface or adapter 170. When used in a
WAN networking environment, the computer 110 typically includes a modem 172 or
other means for establishing communications over the WAN 173, such as the Internet.
The modem 172, which may be internal or external, may be connected to the system bus
121 via the user input interface 160, or other appropriate mechanism. In a networked
environment, program modules depicted relative to the personal computer 110, or
portions thereof, may be stored in the remote memory storage device. By way of 
example, and not limitation, Figure 1 illustrates remote application programs 185 as 
residing on memory device 181. It will be appreciated that the network connections 
shown are exemplary and other means of establishing a communications link between the 
computers may be used. 

With reference now to Figure 2, a target application 202 with a display object 204 
is shown. The display object is spawned as a result of an invoked action within the target 
application, such as from clicking one of the menu items 212 with the mouse 161.
Operating systems that support GUIs, such as Windows, contain numerous types of
display objects that are presented to a user interface 200, including user buttons 206,
menu items 210 and dialogue boxes 204. In Figure 2, the display object is a dialogue box
204 containing three user option buttons 206, where one of them 210 is owner drawn.
The dialogue box 204 also contains a text field 208 for displaying, to the GUI, text
characters having defined font properties. All of the buttons 206 shown within the
dialogue box and the associated text field 208 are child controls of the dialogue box. A 
child control can be best described as a display object that is initiated by a parent object. 
In the illustration, the dialogue box 204 is the parent object, and has four children 
associated with it. Each of the objects has its own handle, which is a unique 
identification number assigned by the operating system 144 that distinguishes one object 
from another. The handles of the various display objects are also indicated in the 
resource file associated with the executable application. Hence, the dialogue box has its 
own unique handle, and each of its children has its own handle, which in many operating 
systems is inherited directly from the parent object. 
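
For illustration only, the parent and child relationship of Figure 2 can be modeled in Python as a small tree in which every display object carries its own handle. The `DisplayObject` type, the sequential handle allocator and the `walk` helper are assumptions of this sketch; actual handle values are assigned by the operating system.

```python
from dataclasses import dataclass, field
from itertools import count

_next_handle = count(1)  # hypothetical stand-in for OS handle assignment


@dataclass
class DisplayObject:
    kind: str
    owner_drawn: bool = False
    handle: int = field(default_factory=lambda: next(_next_handle))
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Attach a child control and return it."""
        self.children.append(child)
        return child


def walk(obj, parent_handle=None):
    """Yield (handle, kind, parent_handle) for obj and all of its descendants."""
    yield obj.handle, obj.kind, parent_handle
    for child in obj.children:
        yield from walk(child, obj.handle)


# The dialogue box of Figure 2: three buttons (one owner drawn) and a text field.
dialog = DisplayObject("dialog")
dialog.add_child(DisplayObject("button"))
dialog.add_child(DisplayObject("button"))
dialog.add_child(DisplayObject("button", owner_drawn=True))
dialog.add_child(DisplayObject("text_field"))

rows = list(walk(dialog))
```

Each row pairs a child's handle with the handle of its parent, which is the kind of association a calling process could use to attribute a captured primitive to the dialogue box that contains it.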

It is common practice amongst software developers to obtain information related 
to a particular display object from the resource file, or by accessing the handle of the 
object directly. Once the handle of the object is known, the developer can view the 
graphics primitives and associated attributes of the object directly from the resource file. 
Examples of information that can be viewed include the various controls within the 
display objects, the placement of the object within the target application or parent 
window, the line properties, etc. However, while the graphics primitives of a display 
object may be obtained in this way, it is not possible to ascertain the font properties of 
display objects directly. This is because the font properties, which are attributes of a text 
primitive, are maintained within a temporary data structure provided by a device context. 
This data structure is not recorded in the resource file, and it is not retained in
memory after the text it relates to is drawn to the screen. Therefore, the only way to
access the font properties of a text string or character is to instantly capture the font 
information from the data structure as it is drawn to the graphical user interface. The 
invention allows the font properties to be captured during runtime of the application. 

Specifically, with reference to Figure 3 and the flowchart of Figure 4, a calling
process 300 injects DLLs into a target application 302 to "spy" on or monitor the
operating system messages generated during target application execution. This technique
is useful for determining the internal state of an application as it is in operation. A spy
component 308 is used to perform the spying. It is an executable program module, such as a
DLL, consisting of various functions and data for monitoring an application or process as
it is in execution. The spy component 308 is injected into the target process or
application 302 via an injection component 306 (event 406). The injection component
306 is an executable program module, such as a DLL, having executable instructions for
injecting source code and/or program modules into a target process 302, and it is invoked
by the calling process 300 (event 404). Once the spy component is injected 310, it
installs function patches into the operating system's application programming interfaces
(APIs) that have executable instructions and routines for outputting graphics primitives to
a user interface (event 410). For example, the primary API for Windows, known as
Win32, provides a complete set of functions to format and draw text primitives in the
user interface 200 of a target application 302. These functions, such as DrawText(),
TextOut(), and PolyTextOut(), are responsible for outputting individual characters,
symbols, or entire strings of text to the screen. Within the Windows environment, these
are the functions that are patched by the spy component 308 after injection (event 410).
The function patches serve the specific purpose of capturing the text that is rendered to
the user interface screen 191 during the execution of the target process or application
302.
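
The idea of a function patch can be sketched in a few lines. The following is a
simulation under stated assumptions, not Win32 code: TextApi, DeviceContext and
install_patch are hypothetical names, and the patch is applied by swapping a Python
attribute rather than by rewriting an API entry point. What it shows is the pattern
described above: the patch records the text and the font held by the device context at
the moment of drawing, then forwards the call so that rendering is unaffected.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DeviceContext:
    """Hypothetical stand-in for a device context carrying the selected font."""
    font_face: str
    font_height: int


class TextApi:
    """Hypothetical stand-in for an OS text-output API; this is not Win32."""

    def text_out(self, dc: DeviceContext, x: int, y: int, text: str) -> bool:
        return True  # pretend the string was drawn


def install_patch(api: TextApi, captured: List[Dict[str, object]]) -> Callable[[], None]:
    """Replace api.text_out with a recording wrapper; return an uninstall function."""
    original = api.text_out

    def patched(dc: DeviceContext, x: int, y: int, text: str) -> bool:
        # Record the text and the font while the device context is still live.
        captured.append({"text": text, "x": x, "y": y,
                         "font_face": dc.font_face, "font_height": dc.font_height})
        return original(dc, x, y, text)  # forward so drawing is unaffected

    api.text_out = patched

    def uninstall() -> None:
        api.text_out = original

    return uninstall


api = TextApi()
captured: List[Dict[str, object]] = []
uninstall = install_patch(api, captured)
api.text_out(DeviceContext("Tahoma", 11), 10, 20, "OK")
uninstall()
api.text_out(DeviceContext("Tahoma", 11), 10, 40, "Cancel")  # patch removed: not recorded
```

Only the call made while the patch is installed is recorded, which mirrors the way the
font properties must be captured as the text is drawn rather than afterwards.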

In addition to installing the function patches, the spy component installs one or
more hook functions into the operating system's APIs to monitor system messages passed
during execution of the target application/process 302 (event 412). Hook functions
intercept system messages that are passed between the operating system and system
threads during process execution. Whenever an action such as the pressing of a function
key, click of a mouse, or activation of a dialogue box occurs, the OS generates a message
that passes through a chain of hook procedures before reaching the target process.
Standard hook functions are utilized to monitor the system messages that pertain
specifically to the output of display objects to a user interface screen 191. In the
Windows operating system, for example, the WH_MSGFILTER and
WH_SYSMSGFILTER hooks are used to monitor messages that are to be processed by a
menu, scroll bar, message box, or dialog box.
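
The passage of a message through a chain of hook procedures can be modelled as follows.
This is a simplified sketch, not the Windows hook mechanism itself: HookChain, the
message dictionaries and the "recipient" field are hypothetical, chosen only to show a
filter hook that observes messages bound for menus, scroll bars, message boxes and
dialog boxes before the target receives them.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
HookProc = Callable[[Message], None]


class HookChain:
    """Hypothetical model of a hook chain placed in front of a target process."""

    def __init__(self, target: HookProc) -> None:
        self._hooks: List[HookProc] = []
        self._target = target

    def install(self, proc: HookProc) -> None:
        self._hooks.insert(0, proc)  # newest hook sits at the head of the chain

    def uninstall(self, proc: HookProc) -> None:
        self._hooks.remove(proc)

    def dispatch(self, message: Message) -> None:
        for proc in list(self._hooks):
            proc(message)  # every hook sees the message first
        self._target(message)  # then the target process receives it


WATCHED = {"menu", "scroll bar", "message box", "dialog box"}
seen: List[Message] = []
delivered: List[Message] = []


def display_object_filter(message: Message) -> None:
    """Keep only messages destined for the watched kinds of display object."""
    if message["recipient"] in WATCHED:
        seen.append(message)


chain = HookChain(target=delivered.append)
chain.install(display_object_filter)
chain.dispatch({"recipient": "dialog box", "event": "activate"})
chain.dispatch({"recipient": "edit control", "event": "key press"})
```

Both messages still reach the target, but only the dialog box message is retained by the
filter, which is the behaviour the monitoring hooks rely on.
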
The hook functions are installed and uninstalled accordingly by a hook
management component 304, which is called upon by the spy component after it is
injected 310 into the target process. The hook management component is a separate
process from the calling process and target process, and has executable instructions for
installing and uninstalling hook functions within a process or application designated by
the spy component. In accordance with the invention, the designated process is the target
process or application. Because the hook management component operates as an
independent process, one or more calling processes can use the hook management
component to install hooks within one or more target processes. In this way, the system
messages of multiple independently running processes or applications can be monitored
during runtime. As soon as an action is invoked within the target process 302 that results
in the invocation of a display object, the generated message is captured by the one or
more hook functions (event 414). This action results in the release of a trigger, or flag,
that activates the function patches to capture the graphics primitives that are drawn to the
screen to render the display object (event 416).
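
The trigger that links the hook functions to the function patches can be illustrated
with a shared flag. The sketch below is an assumption about one way such a flag could
behave; the names capture_trigger, hook_proc and patched_text_out are hypothetical, and a
threading.Event is used merely as a convenient flag object.

```python
import threading
from typing import Dict, List

# Hypothetical trigger shared by the hook function and the function patch.
capture_trigger = threading.Event()
captured: List[str] = []


def hook_proc(message: Dict[str, str]) -> None:
    """Release the trigger when a message announces a new display object."""
    if message.get("event") == "display object invoked":
        capture_trigger.set()


def patched_text_out(text: str) -> None:
    """Stand-in for a patched text function; records only while triggered."""
    if capture_trigger.is_set():
        captured.append(text)


patched_text_out("status bar text")              # trigger not released: ignored
hook_proc({"event": "display object invoked"})   # event 414 releases the trigger
patched_text_out("Save changes?")                # event 416: now captured
```

Gating the patches in this way keeps them from recording drawing activity that is
unrelated to the display object of interest.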



After the function patches are enabled, the system messages captured by the one or
more hook functions are used to invalidate the display object under execution (event
418). The invalidation of a display object is a process whereby a selected or active
display object is redrawn due to a user or system invoked change to the object. For
instance, a display object is invalidated each time a user resizes the display object or
moves it to a different position within the user interface 200. When such an action
occurs, the display object is redrawn by calling the same API functions and routines that
rendered it to the interface screen the first time. By using the system messages (which
provide a context for the captured graphics primitives) captured by the hook functions,
the appropriate API function calls are made, which in turn enumerate the display object
to the screen. As the display object is redrawn/invalidated, the installed function patches
capture the graphics primitives, such as the text and other drawing elements, that are
drawn to the user interface 200 (event 420). Thus, the graphics primitives and
associated attributes of the graphics primitives related to the display object are captured
in association with the context information provided by the system messages.
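
The role of invalidation can be shown with a small simulation. It assumes a hypothetical
Dialog class whose paint method issues the same draw calls on every redraw; the context
dictionary stands in for the information carried by the captured system messages. None of
these names come from an actual operating system API.

```python
from typing import Callable, Dict, List, Tuple

DrawCall = Callable[[str, str], None]  # (text, font) -> None


class Dialog:
    """Hypothetical display object that paints itself through a text function."""

    def __init__(self, items: List[Tuple[str, str]]) -> None:
        self.items = items
        self.paint_count = 0

    def paint(self, draw_text: DrawCall) -> None:
        self.paint_count += 1
        for text, font in self.items:
            draw_text(text, font)  # same calls on every redraw


def invalidate_and_capture(dialog: Dialog, context: Dict[str, str]) -> List[Dict[str, str]]:
    """Force a redraw and record each primitive together with its context."""
    records: List[Dict[str, str]] = []

    def recording_draw_text(text: str, font: str) -> None:
        records.append({"text": text, "font": font, **context})

    dialog.paint(recording_draw_text)  # the forced redraw re-issues the draw calls
    return records


dialog = Dialog([("Save changes?", "Tahoma 11"), ("Yes", "Tahoma 11")])
dialog.paint(lambda text, font: None)  # first rendering, before capture was armed
records = invalidate_and_capture(dialog, {"object": "dialog box", "event": "activate"})
```

Because the redraw repeats the original draw calls, nothing is lost by arming the capture
after the object first appeared, and each record carries the context it was drawn in.
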
Finally, after the hook functions and function patches capture the graphics
primitives, including the font, and the context information accordingly, this information is
packaged into a data structure and sent to the calling process 300 as a system message
(events 422 and 424). The process of capturing the graphics primitives and context
information is continued for each invoked action within the target process, until
execution is terminated. Because the hook functions and patches capture the runtime
resource information on demand, the calling process receives this packaged information
in an enumerated format.
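
Packaging and delivery might look like the following sketch. The description does not
specify a wire format, so JSON and an in-memory queue are assumptions made purely for
illustration; package, send and receive are hypothetical helper names, and the queue
stands in for the system message that carries the data to the calling process 300.

```python
import json
import queue
from typing import Dict, List

# Hypothetical channel standing in for the system message sent to the calling process.
to_calling_process: "queue.Queue[bytes]" = queue.Queue()


def package(context: Dict[str, str], primitives: List[Dict[str, object]]) -> bytes:
    """Bundle context and primitives, in draw order, into one serialized record."""
    record = {"context": context, "primitives": primitives}
    return json.dumps(record).encode("utf-8")


def send(payload: bytes) -> None:
    to_calling_process.put(payload)


def receive() -> Dict[str, object]:
    """Calling-process side: unpack the next record."""
    return json.loads(to_calling_process.get_nowait().decode("utf-8"))


send(package({"object": "dialog box", "event": "activate"},
             [{"text": "Save changes?", "font_face": "Tahoma", "font_height": 11}]))
received = receive()
```

One record is produced per invoked action, so the calling process sees a stream of
self-contained packages rather than a single dump at the end of execution.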

Figure 5 is illustrative of the data structure containing the captured graphics
primitives, associated attributes and context information related to the dialogue box 204
of Figure 2. While this information is captured and retrieved in a format similar to that
shown in the figure, those skilled in the art will recognize that the actual data, and the
format of the data, are dependent upon the type of action invoked within the target process.
Moreover, the information contained within the data structure will differ from one type of
display object to another, as each object can consist of various buttons, text fields, and
other drawing elements. Therefore, the information that is captured and stored within the
data structure will vary from one process to another.

In Figure 5, the captured information includes data and parameters that are
descriptive of the display object 204 both functionally and graphically. The object type
as shown is a DIALOGUE_BOX 500, which further consists of other data and
parameters that define its appearance and operation. These parameters include the
object's four child controls, namely TEXT_FIELD 502, OWNERDRAWN_BUTTON
504, BUTTON_1 506, and BUTTON_2 508 (these names were assigned for illustrative
purposes only). As shown in Figure 2, three of the controls are user buttons, and one of
the buttons is owner drawn 210. Each button is a display object that is composed of one
or more graphics primitives, such as text characters and lines that define its shape.
Likewise, the primitives have unique attributes, such as font type, text size, line length
and other stylistic characteristics. The sequential format of the information as shown in
Figure 5 is due to the function patches that capture the graphics primitives and associated
attributes. As the display objects of the target application are enumerated to the screen,
the graphics primitives and attributes of such primitives are instantly captured, stored in
the data structure in the order in which the various attributes are drawn to the screen, and
returned to the calling process. The ability to capture the graphics primitives and
associated attributes of the graphics primitives in connection with the operating system
messages passed during process execution (context information) results in the return of
complete graphic information to the calling process.
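
A nested structure of the kind described for Figure 5 could be represented as shown
below. The object names mirror those listed in the text (spelled here with underscores),
but the classes Primitive and CapturedObject, the helper in_capture_order and every
attribute value are invented placeholders rather than the contents of the figure.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Primitive:
    kind: str                       # e.g. "text" or "line"
    attributes: Dict[str, object]   # e.g. font type, text size, line length


@dataclass
class CapturedObject:
    object_type: str
    primitives: List[Primitive] = field(default_factory=list)
    children: List["CapturedObject"] = field(default_factory=list)


def in_capture_order(obj: CapturedObject) -> Iterator[Tuple[str, Primitive]]:
    """Yield (owner, primitive) pairs in the order they were stored."""
    for primitive in obj.primitives:
        yield obj.object_type, primitive
    for child in obj.children:
        yield from in_capture_order(child)


# Attribute values below are invented placeholders, not taken from Figure 5.
dialogue_box = CapturedObject("DIALOGUE_BOX", children=[
    CapturedObject("TEXT_FIELD", [Primitive("text", {"font": "Tahoma", "size": 11})]),
    CapturedObject("OWNERDRAWN_BUTTON", [Primitive("line", {"length": 40}),
                                         Primitive("text", {"font": "Tahoma", "size": 11})]),
    CapturedObject("BUTTON_1", [Primitive("text", {"font": "Tahoma", "size": 11})]),
    CapturedObject("BUTTON_2", [Primitive("text", {"font": "Tahoma", "size": 11})]),
])
owners = [owner for owner, _ in in_capture_order(dialogue_box)]
```

Walking the structure returns the primitives in their stored sequence, which corresponds
to the sequential format that results from capturing each primitive as it is drawn.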

The invention as described herein can be incorporated into the source code of a
calling process directly, or called upon by the calling process to capture and retrieve the
graphics primitives and associated context information of a display object. The calling
process can be any computer executable application where the capture of the various
graphics primitives that are output to a user interface by a target process is required.
Once the information is captured, the calling process can process it accordingly. Examples
of applications that can practice the capturing techniques disclosed include, but are not
limited to, localization tools, language processing applications and text-to-speech
applications. Also, while the methods disclosed are applicable to various operating
systems and platforms, the ability to capture graphics primitives and associated attributes
in connection with context information can be particularly useful within the Windows OS
to better support Active Accessibility applications. These types of applications are
commonly used to make computer applications accommodating for people with physical
disabilities, such as blindness or restricted mobility. By integrating the ability to capture
graphics primitives and attributes of such primitives from a target process, Active
Accessibility can be better supported within applications having display objects that are
not native to the operating system itself (e.g. owner-draw controls). Indeed, the invention
can be practiced in any system that requires or desires instant access to typographical or
visual information from any display object that can be rendered to a user interface.

In this description, the invention is described with reference to acts and symbolic
representations of operations that are performed by one or more computers such as the
computer 110, unless indicated otherwise. As such, it will be understood that such acts
and operations, which are at times referred to as being computer-executed, include the
manipulation by the processing unit of the computer of electrical signals representing
data in a structured form. This manipulation transforms the data or maintains it at
locations in the memory system of the computer, which reconfigures or otherwise alters
the operation of the computer in a manner well understood by those skilled in the art.
The data structures where data is maintained are physical locations of the memory that
have particular properties defined by the format of the data. However, while the
invention is being described in the foregoing context, it is not meant to be limiting as
those of skill in the art will appreciate that the various acts and operations described
hereinafter may also be implemented in hardware.

All of the references cited herein, including patents, patent applications, and 
publications, are hereby incorporated in their entireties by reference. In view of the many 
possible embodiments to which the principles of this invention may be applied, it should 
be recognized that the embodiment described herein with respect to the drawing figures is 
meant to be illustrative only and should not be taken as limiting the scope of the invention.
For example, those having skill in the art will recognize that the elements of the 
illustrated embodiment shown in software may be implemented in hardware and vice 
versa or that the illustrated embodiment can be modified in arrangement and detail 
without departing from the spirit of the invention. Therefore, the invention as described 
herein contemplates all such embodiments as may come within the scope of the following 
claims and equivalents thereof. 



